Synthesising speech by converting phonemes to digital waveforms.

This invention relates to synthetic speech and more 
particularly to a method of synthesising a digital waveform 
from signals representing phonemes. 
There are many circumstances, eg. in telephone
systems, where it is convenient to use synthesised speech.
In some applications the starting point is an electronic
representation of conventional typography, eg. a disk
produced by a word processor. Many stages of processing are
needed to produce synthesised speech from such a starting
point but, as a preliminary part of the processing, it is
usual to convert the conventional text into a phonetic text.
In this specification the signals representing such a
phonetic text will be called "phonemes". Thus this invention
addresses the problem of converting the signals representing
phonemes into a digital waveform. It will be appreciated
that digital waveforms are commonplace in audio
technology and that digital-to-analogue converters and
loudspeakers are well known devices which enable digital
waveforms to be converted into acoustic waveforms.

Many processes for converting phonemes into digital
waveforms have been proposed and it is conventional to do
this by means of a linked database comprising a large number
of entries, each having an access portion defined in phonemes
and an output portion containing the digital waveform
corresponding to the access phonemes. Clearly all the
phonemes should be represented in the access portions but it
is also known to incorporate strings of phonemes in addition.
However, existing systems only take into account the phoneme
strings contained in the access portions and do not further
take into account the context of the strings.
This invention, which is defined in the claims, uses a
linked database to convert strings of phonemes into digital
waveform but it also takes into account the context of the
selected phoneme strings. The invention also comprises a
novel form of database which facilitates the taking into
account of the context and the invention also includes the
method whereby the preferred database strings are selected
from alternatives stored therein.

A preferred embodiment of the invention will now be
described by way of example.

GENERAL DESCRIPTION

This general description is intended to identify some
of the important integers of a preferred embodiment of the
invention. Each of these integers will be described in
greater detail after this general description.

The method of the invention converts input signals
representing a text expressed in phonemes into a digital
waveform which is ultimately converted into an acoustic wave.
Before its conversion, the initial digital waveform may be
further processed in accordance with methods which will be
familiar to persons skilled in the art.

The phoneme set used in the preferred embodiment
conforms to the SAM-PA (Speech Assessment Methodologies
Phonetic Alphabet) simple set number 6. It is to be
understood that the method of the invention is carried out in
electronic equipment and the phonemes are provided in the
form of signals so that the method corresponds to the
converting of an input waveform into an output waveform.

The preferred embodiment of the invention converts
waveform representing strings of one, two or three phonemes
into digital waveform but it always operates on strings of
five phonemes so that at least one preceding and at least one
following phoneme is taken into account. This has the effect
that, when alternative strings of five phonemes are
available, the "best" context is selected.

It has just been explained that this invention makes
particular use of a string of five phonemes. This string
will hereinafter be called a "context window" and the five
phonemes which constitute the "context window" will be
identified as P1, P2, P3, P4 and P5 in sequence.

digital waveforms;

and it is always possible to achieve an exact match for at
least one phoneme. In the preferred embodiment it is always
possible to match exactly P3 of the data context window with
P3 of the stored context window but, in general, further
exact matches may not be possible.

This invention defines a correlation parameter between
two phonemes as follows. Corresponding to each phoneme there
is a type-vector which consists of an ordered list of
co-efficients. Each of these co-efficients represents a feature
of its phoneme, e.g. whether its phoneme is voiced or
unvoiced or whether or not its phoneme is a sibilant, a
plosive or a labial. It is also desirable to include
locational features, eg whether or not the phoneme is in a
stressed or unstressed syllable. Thus the type-vector
uniquely characterises its phoneme and two phonemes can be
compared by comparing their type-vectors co-efficient by
co-efficient, e.g. by using an exclusive-or gate (which is
sometimes called an equivalence gate). The number of
matchings is one way of defining the correlation parameter.
If desired this can be converted to a percentage by dividing
by the maximum possible value of the parameter and
multiplying by 100.

(As an alternative, a mis-match parameter can be
defined, e.g. by counting the number of discrepancies in the
two type-vectors. It will be appreciated that selecting a
"best" match is equivalent to selecting a lowest mis-match.)
The primary definition relates to the correlation
parameter of a pair of phonemes. The correlation parameter
of a string is obtained by summing or averaging the
parameters of the corresponding pairs in the two strings.
Weighted averages can be utilised where appropriate.
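
By way of illustration only, the correlation parameter described above might be sketched as follows. The feature layout of the type-vector (voiced, sibilant, plosive, labial, stressed) and all function names are assumptions made for this example and are not taken from the specification.

```python
from typing import Optional, Sequence

# Hypothetical binary type-vector layout, assumed for illustration:
# (voiced, sibilant, plosive, labial, stressed)
TypeVector = Sequence[int]


def phoneme_correlation(a: TypeVector, b: TypeVector) -> int:
    """Count the co-efficients on which two type-vectors agree."""
    if len(a) != len(b):
        raise ValueError("type-vectors must have the same length")
    return sum(1 for x, y in zip(a, b) if x == y)


def phoneme_mismatch(a: TypeVector, b: TypeVector) -> int:
    """Alternative parameter: count the discrepancies instead."""
    return len(a) - phoneme_correlation(a, b)


def correlation_percent(a: TypeVector, b: TypeVector) -> float:
    """Correlation as a percentage of its maximum possible value."""
    return 100.0 * phoneme_correlation(a, b) / len(a)


def string_correlation(
    s1: Sequence[TypeVector],
    s2: Sequence[TypeVector],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """(Weighted) average of the pairwise correlations of two strings."""
    if len(s1) != len(s2) or not s1:
        raise ValueError("strings must be non-empty and of equal length")
    if weights is None:
        weights = [1.0] * len(s1)
    total = sum(w * phoneme_correlation(p, q)
                for w, p, q in zip(weights, s1, s2))
    return total / sum(weights)
```

Selecting the highest value of `phoneme_correlation` and selecting the lowest value of `phoneme_mismatch` pick out the same candidate, since the two always sum to the length of the type-vector.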

In the preferred embodiment, the database is based on
an extended passage of the selected language, eg English
(although the information content of the passage is not
important). A suitable passage lasts about two or three
minutes and it contains about 1000-1500 phonemes. The
precise nature of the extended passage is not particularly
important although it must contain every phoneme and it
should contain every phoneme in a variety of contexts.

The extended passage can be stored in two different 
formats. First the extended passage can be expressed in 
phonemes to provide the access section of a linked database. 
More specifically, the phonemes representing the extended 
passage are divided into context windows each of which 
contains 5 phonemes. The method of the invention comprises 
obtaining best matches for the data context windows with the 
stored context windows just identified. 

The extended passage can also be provided in the form
of a digitised waveform. As would be expected, this is
achieved by having a reader or reciter speak the extended
passage into a microphone so as to make a digital recording
using well established technology. Any point in the digital
recording can be defined by a parameter, e.g. by the time
from the start. Analysing the recording establishes values
for the time-parameter corresponding to the break between
each pair of phonemes in the equivalent text. This
arrangement permits phoneme-to-waveform conversion for any
included string by establishing the starting value of the
time-parameter corresponding to the first phoneme of the
string and the finishing value for the time-parameter
corresponding to the last phoneme of the string and
retrieving the equivalent portion of database, ie the
specified digital waveform. Specifically a conversion for
any string of one, two or three phonemes can be achieved.
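
As a minimal sketch of this retrieval, assuming the recording is held as a sequence of samples and the phoneme breaks as a list of times in seconds, the portion of waveform for an included string could be obtained as below. The names `slice_waveform`, `boundaries` and `sample_rate` are hypothetical and chosen only for this example.

```python
def slice_waveform(samples, boundaries, first, last, sample_rate):
    """Return the samples covering phonemes first..last (inclusive).

    boundaries[i] is the time (seconds from the start of the recording)
    of the break immediately before phoneme i, so a passage of n
    phonemes has n + 1 boundaries.
    """
    if not 0 <= first <= last < len(boundaries) - 1:
        raise IndexError("phoneme indices out of range")
    start = round(boundaries[first] * sample_rate)    # starting value
    end = round(boundaries[last + 1] * sample_rate)   # finishing value
    return samples[start:end]
```

Only the two time-parameter values are needed to retrieve a string, so strings of one, two or three phonemes are handled by the same call.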

The important requirement is to select the best 
portion of the extended text for the conversion. 

It has already been mentioned that the phoneme version
of the extended text is stored in the form of context windows
each of five phonemes. This is most suitably achieved by
storing the phonemes in a tree which has three hierarchical
levels.

The first level of the hierarchy is defined by phoneme
P3 of each window. The effect is that every phoneme gives
direct access to a subset of the context windows, ie. the
totality of context windows is divided into subsets and each
subset has the same value of P3.

The next level of the tree is defined by phonemes P2
and P4 and, since this selection is made from the subsets
defined above, the effect is that the totality of context
windows is further divided into smaller subsets each of which
is defined by having phonemes P2, P3 and P4 in common.
(There are approximately half a million subsets but most of
them will be empty because the relevant sequence P2, P3, P4
does not occur in the extended text.) Empty subsets are not
recorded at all so that the database remains of manageable
size. Nevertheless it is true that for each triple sequence
P2, P3, P4 which occurs in the extended text there will be a
subset recorded in the second level of the database under P2,
P4 which level will also have been indexed at the first level
under P3.

Finally the second level gives access to a third level
which contains subsets having P2, P3 and P4 as exact matches
and it contains all the values of P1 and P5 corresponding to
these triples. Best matches for data P1 and P5 are selected.
This selection completely identifies one of the context
windows contained in the extended text and it provides access
to time-parameters of said window. Specifically it provides
start and finish time-parameters for up to four different
strings as follows: -

(a) P3 by itself; 

(b) the pair of phonemes P2 + P3; 

(c) the pair of phonemes P3 + P4; and 

(d) the triple consisting of the phonemes P2 + P3 
+ P4. 

In the first instance, the database provides beginning
and ending values of the time-parameter corresponding to each
one of the selected strings (a) - (d). As explained above,
the time-parameter defines the relevant portion of a digital
waveform so that the equivalent waveform is selected.
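
The three-level tree can be pictured, under assumptions, as a nested mapping. The sketch below is illustrative only: the use of Python dictionaries, the function names and the choice to record the position of P3 in the stored passage are all assumptions of this example rather than features stated in the specification.

```python
def build_tree(phonemes):
    """Index every five-phoneme window of the stored passage.

    Level 1 key: P3.  Level 2 key: (P2, P4).
    Level 3: list of (P1, P5, index of P3 in the passage).
    Subsets which do not occur are simply never created.
    """
    tree = {}
    for i in range(2, len(phonemes) - 2):
        p1, p2, p3, p4, p5 = phonemes[i - 2:i + 3]
        tree.setdefault(p3, {}).setdefault((p2, p4), []).append((p1, p5, i))
    return tree


def string_ranges(i):
    """Inclusive phoneme index ranges of strings (a)-(d) for P3 at i."""
    return {
        "a": (i, i),          # P3 by itself
        "b": (i - 1, i),      # P2 + P3
        "c": (i, i + 1),      # P3 + P4
        "d": (i - 1, i + 1),  # P2 + P3 + P4
    }
```

Each index range returned by `string_ranges` would then be turned into beginning and ending values of the time-parameter by looking up the phoneme breaks established for the recording.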
It should be noted that item (d) will be offered if it
is contained in the database; in this case items (a), (b),
and (c) are all embedded in the selected (d) and they are,
therefore, available as alternatives. If item (d) is not
contained in the database then, clearly, this option cannot
be offered.

Even if item (d) is missing from the database, then 
items (b) and/or (c) may still be present in the database. 
When both of these options are offered they will usually 
arise from different parts of the database because item (d) 
is missing. Therefore, depending on the content of the 
database, the selection will offer (b) alone, or (c) alone, 
or both (b) and (c). Thus the selection may provide a choice 
and in any case item (a) is available because it is embedded 
in the pair. 

Finally, even if (b), (c) and (d) are all absent from 
the database, item (a) will always be present and thus "best 
match" will be offered for the single phoneme and this will 
be the only possibility which is offered. 

It will be apparent that items (b), (c) and (d) imply
that strings will overlap. Thus whenever item (c) is
selected for any phoneme then item (b) must be available for
the next phoneme. If nothing better is offered, then the same
part of the database will meet the requirements of (c) for
the earlier phoneme and (b) for the later but, because
different correlations are involved, better choices may be
selected. It will also be apparent that whenever item (d) is
available item (c) will be available for the previous phoneme
and, in addition, item (b) will be available for the
following phoneme. In other words, some of the strings will
overlap, ie there will be alternatives for some phonemes such
that the same phoneme occurs in different places in different
strings. This aspect of the invention is described in
greater detail below.

It has been emphasised that the preferred embodiment
is based on a context window which is five phonemes long.
However the full string of five phonemes is never selected.
Even if, fortuitously, the input text contains a string of
five found in the database only the triple string P2, P3, P4
will be used. This emphasises that the important feature of
the invention is the selection of a string from a context
and, therefore, the invention selects the "best" context
window of five phonemes and only uses a portion thereof in
order to ensure that all selected strings are based upon a
context.
The analysis of the text into phonemes contained in
the database is carried out phoneme by phoneme, but each
phoneme is utilised in its context window. The next part of
the description will be based upon the selection procedure
for one of the data phonemes, it being understood that the
same procedure is used for each of the data phonemes.

The selected data phoneme is not utilised in isolation
but as part of its context window. More precisely the
selected data phoneme becomes phoneme P3 of a data window
with its two predecessors and two successors being selected
to provide the five phonemes of the relevant context window.
The database described above is searched for this context
window. Since it is unlikely that the exact window will be
located, the search is for the best fitting of the stored
context windows.
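
The formation of data context windows can be sketched as below. The specification as reproduced here does not say how the first and last phonemes of a text are treated, so the padding with a silence symbol `"#"` is purely an assumption of this example, as is the function name.

```python
def context_windows(phonemes, pad="#"):
    """Return one five-phoneme window per data phoneme.

    Each data phoneme is P3 of its own window, flanked by its two
    predecessors and two successors.  Padding the ends with a silence
    symbol is an assumption made for this sketch.
    """
    padded = [pad, pad] + list(phonemes) + [pad, pad]
    return [tuple(padded[i:i + 5]) for i in range(len(phonemes))]
```

For the three-phoneme input `["k", "a", "t"]` this yields three windows whose central elements are `"k"`, `"a"` and `"t"` respectively.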

The first step of the search involves accessing the
tree described above using phoneme P3 as the indexing
element. As explained above this gives immediate access to
a subset of the stored context windows. More specifically,
accessing level one by phoneme P3 gives access to a list of
phoneme pairs which correspond to possible values of P2 and
P4 of the data context window. The best pair is selected
according to the following four criteria.

Criterion 1. Fortuitously, it may happen that one
pair in the subset gives an exact match for data P2 and P4.
When this happens that pair is selected and the search
immediately proceeds to level 3. This outcome is unlikely [...]
[...]

The third criterion is similar [...] progress to level 3 is
[...] parameter.

Criterion four occurs when there is no match for
either P2 or P4, in which case the pair with the
highest average correlation parameter [...]
basis of access to level 3.

[...] criterion [...] succeeds, then it [...]
right-hand pair and a single [...] criterion 2, 3 and 4.

Even if criterion 1 fails, it is possible that a
left-hand pair will be found by criterion 2 [...] be
possible that, simultaneously, a right-hand pair [...]
found by criterion 3. However, because criterion 1 has failed
[...] the database [...] phoneme P3 cannot [...]
in other context windows.

Thus, when criterion 1 or 4 is utilised there will
only be access to one portion of the tree [...]
We have now described how the selection of a context
window gives rise to either one or two areas of the third
level of the tree. In each case the third level may contain
several pairings for phonemes 1 and 5 of the data context
window. The pair with the best average correlation parameter
is selected as the context window in the access portion of
the database. As explained above this context window is
converted to digital waveform using the time-parameter.

To re-emphasise: where criterion 1 is used only one
context window is selected but it gives rise to four
possibilities, namely time-parameter ranges for: -

(i) the triple P2 + P3 + P4;

(ii) the left-hand pair P2 + P3;

(iii) the right-hand pair P3 + P4, and;

(iv) the single P3 by itself.

When criterion 2 operates, this provides time-
parameter ranges only for the left-hand pair P2 + P3 and for
the single P3 by itself. When criterion 3 operates similar
considerations apply but the parameter ranges are for the
right-hand pair P3 + P4 and for the single P3. If both
criteria operate this offers two choices for [...]
and only the one with the higher correlation parameter for
P1 + P5 is selected.

Finally when criterion 4 operates there is only one
possibility, namely the phoneme P3 by itself.
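
The selection among the four criteria might be sketched as follows. The text describing criteria 2 and 3 is incomplete in this copy, so this sketch assumes that criterion 2 requires an exact match on P2 and criterion 3 an exact match on P4, each ranked by the correlation of the remaining phoneme. Those assumptions, the function name `select_pairs` and the caller-supplied `corr` function are hypothetical.

```python
def select_pairs(subset, p2, p4, corr):
    """Choose stored (P2, P4) pairs for the data phonemes p2 and p4.

    subset: the stored (P2, P4) pairs found under the matched P3
            (assumed non-empty, since P3 can always be matched).
    corr:   corr(stored, data) -> correlation parameter of two phonemes.
    Returns a list of (criterion number, pair) with one or two entries.
    """
    pairs = list(subset)
    if (p2, p4) in pairs:                       # criterion 1: exact match
        return [(1, (p2, p4))]
    chosen = []
    left = [q for q in pairs if q[0] == p2]     # criterion 2 (assumed)
    if left:
        chosen.append((2, max(left, key=lambda q: corr(q[1], p4))))
    right = [q for q in pairs if q[1] == p4]    # criterion 3 (assumed)
    if right:
        chosen.append((3, max(right, key=lambda q: corr(q[0], p2))))
    if not chosen:                              # criterion 4: no match
        best = max(pairs,
                   key=lambda q: (corr(q[0], p2) + corr(q[1], p4)) / 2)
        chosen.append((4, best))
    return chosen
```

When two entries are returned (criteria 2 and 3 together), each leads to its own area of the third level, which corresponds to the "one or two areas" mentioned above.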

The description given above explains how conversions
are provided for each phoneme of an input text. Sometimes
the method provides a conversion for only a single phoneme;
in this case, no alternatives are offered. In some
cases the method provides conversion for strings of two or
three adjacent phonemes and in these circumstances, the
conversion provides alternatives for at least one phoneme.
In order to complete the selection, it is necessary to reduce
the number of alternatives to one. The preferred method of
achieving this reduction will now be explained.

The preferred method of making the reduction is
carried out by processing a short segment of input text, eg.
a segment which begins and ends with a silence. Provided it
is not too long a sentence constitutes a suitable segment.
If a sentence is very long, e.g. more than thirty words, it
usually contains one or more embedded silences, eg between
clauses or other sub-units. In the case of long sentences
such sub-units are suitable for use as the segments.

The processing of a segment to reduce each set of
alternatives to one will now be described. As mentioned, no
alternative will be offered for some of the phonemes and,
therefore, no selection is required for these phonemes.
Alternatives will be available for the other phonemes and the
selection is made so as to produce a "best" result for the
segment as a whole. This may involve making a locally "less
good" selection at one point in the segment in order to
obtain "better" selection elsewhere in the segment. The
criteria of "better" include: -

(i) taking longer strings rather than shorter 
strings, and 

(ii) selecting from strings which overlap rather 
than from strings which merely abut. 
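
The specification as reproduced here does not give the selection procedure, so the following is only one possible sketch. It uses dynamic programming to divide a segment into the fewest strings, which reflects criterion (i); criterion (ii), the preference for overlapping strings, is not modelled. The function name and the representation of the offered strings as `(start, length)` tuples are assumptions.

```python
def fewest_strings(available, n):
    """Divide phonemes 0..n-1 of a segment into the fewest strings.

    available: set of (start, length) strings offered by the database,
               with length 1, 2 or 3.
    Returns the chosen strings in input order.
    """
    inf = float("inf")
    best = [0] + [inf] * n     # best[k]: fewest strings for first k phonemes
    back = [None] * (n + 1)    # back[k]: last string used to reach k
    for end in range(1, n + 1):
        for length in (3, 2, 1):
            start = end - length
            if (start >= 0 and (start, length) in available
                    and best[start] + 1 < best[end]):
                best[end] = best[start] + 1
                back[end] = (start, length)
    if best[n] == inf:
        raise ValueError("segment cannot be covered by the offered strings")
    cover, end = [], n
    while end > 0:
        start, length = back[end]
        cover.append((start, length))
        end = start
    return cover[::-1]
```

Because the whole segment is optimised at once, a shorter string may be accepted at one point when that allows longer strings elsewhere, which is the "locally less good" behaviour described above.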
The rejection of unwanted alternatives produces a
position in which each phoneme has one, and only one,
conversion. In other words the input text will have been
divided into sub-strings of 1, 2 or 3 phonemes matching the
database and the beginning and ending values for the selected
strings will therefore be established. The output portion
of the database takes the form of a digitised waveform and
the parameters which have been established define segments of
this waveform. Therefore the designated segments are
selected and abutted to produce the digital waveform
corresponding to the input text. This completes the
requirement of the invention.
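
The final abutting step can be sketched very simply, assuming the recording is a sequence of samples and each selected string has already been reduced to a pair of sample positions. The name `abut` and the half-open `(start, end)` convention are assumptions of this example.

```python
def abut(recording, ranges):
    """Join the designated segments of the recording in input order.

    ranges: (start, end) sample positions, end exclusive, one per
            selected string, listed in the order of the input text.
    """
    output = []
    for start, end in ranges:
        output.extend(recording[start:end])
    return output
```

The segments are generally taken from different parts of the extended recording, so their order in `ranges` follows the input text and not their position in the recording.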

Having obtained a digital waveform this can be 
provided as audible output using conventional digital to 
analogue conversion techniques and conventional loudspeakers. 
If desired, the primary digital waveform can be enhanced 
Figure 1 illustrates [...]

Figure 2 shows a [...] illustrated in
Figure 1 attached to a telephone [...]

As shown in Figure 1 the speech engine [...]
the invention comprises a primary [...]
to accept text in [...] and to produce therefrom an
equivalent text in [...]. [...] text is passed to a
converter 12 which is associated with a database
13 in accordance with [...]. Converter 12 matches
segments of the phoneme text with segments stored in the
access portion of [...]. [...]
waveform are retrieved and these are assembled into extended
portions of digital waveform [...] to extended
portions of the original input.

These extended portions of digital waveform are passed
to waveform processor 14 where they are [...] to further
processing in order to produce [...]. [...]
digital output is converted into [...]
is provided at output port 15 for [...] transmission.

As shown in Figure 1 the speech engine [...]
receive its input from [...] database 16 which holds
texts in conventional orthography. [...] database 16 is
conveniently operated [...] select a text stored
in database 16. This [...] primary
converter 11 and it appears at output port 15 as an
analogue waveform.

Figure 2 shows a speech engine as illustrated in
Figure 1 [...] network. As
shown in Figure 2 [...] telephone 20 is
connected to a station 22 via a [...] network 21.
Station 22 includes a speech engine [...] and
the output port 15 is connected to the [...]
information available in the external database 16 can be
provided, as an analogue acoustic waveform, to [...]
telephone [...].

If desired the keypad (used for dialling) of the
telephone [...] can be used as the [...]
read [...] preferably [...]

A simpler technical arrangement provides a
human operator at the speech engine [...] 20 and the human
operator actuates the keyboard 17 in accordance with
instructions received over the network. When [...]
a portion of text this [...] further participation by the
operator is unnecessary. Thus the operator is freed to
assist with [...] the use of a speech engine enhances the
efficiency of the operation.

It will be appreciated that there are other
applications for a speech engine according to [...]
eg. it is suitable for connection to [...] system.



PCT/GB94/01688

95/04988

1. A method of converting [...] into an output
[...] represents a text [...]

(a) dividing said input signal [...]
each of which is stored in [...] access section of a
linked database,

(b) for each segment identified in step (a) retrieving
[...] section [...] which [...]
linked to the input segment, and

(c) [...] in the order [...]
of the equivalent input [...],

whereby the concatenated digital [...] is a waveform
corresponding to the input signal, characterised in that the
output section of the database [...] digital
waveform having a location [...] for identifying any
point therein whereby the establishing of beginning and
ending location parameters defines [...] extended
digital waveform, and step (b) [...] establishing
beginning and ending location parameters for segments of the
[...] established in (a) [...] of stored digital
waveform.

2. A method according to claim 1, wherein step (a)
comprises comparing [...] input [...]
the input section of the [...]

3. A method according to claim 2, wherein each window has
a length equivalent to 5 phonemes.
[...] (iii) a lowest [...] window, whereby
[...] fifth phonemes of the [...]
for the central phoneme of the input window from the first
level of the hierarchy, [...]
2 and 4 from the second level of the hierarchy [...]
selection in the second level of the hierarchy.

5. A database for use as a component of a speech engine,
said database having an output [...] said
extended digital waveform [...]
a common address parameter identifying common points in both
sections whereby the identification of a segment [...]
access section [...]
the parameter and hence identifies the [...] segment
of the digital waveform.

[...] portion contains windows of 5 phonemes [...]
hierarchy [...] second [...]
centre phoneme of a window [...]
phonemes of a window whereby entries in the higher
hierarchical level are equivalent to strings of 3 phonemes
[...] a lower hierarchical [...]

[...] for converting a text in [...]
text in phonemes and a converter for converting said
text in phonemes [...], characterised in
that the converter (12) includes a database (13) according to
either Claim 5 or Claim 6.

[...] according to Claim 7, said [...]
the network for the transmission of the [...]
engine to a remote location.
[received by the International Bureau [...] 1.95);

[...] signal, wherein said input signal [...]
text, wherein said [...] into segments each of
[...] is linked to the input segment and step

(c) joining the [...]
(b), said segments being kept in [...]
equivalent input segments

whereby the results [...] waveform
corresponding to the input [...], characterised in that the
[...] an extended digital
waveform having a location [...]
point therein whereby the establishing of beginning and
ending location parameters defines a portion [...]
digital waveform, and step (b) [...] establishing
beginning and ending location parameters [...] segments of the
[...] established in (a) [...]
waveform.

2. A method according to claim 1, wherein step (a)
comprises comparing [...] windows of
the input section of the database [...]
match for the input signal.

3. A method according to claim 2, wherein each window has
a length equivalent to 5 phonemes.
4. A method according to claim 3, in which [...]
section of the database is organised into three [...]
levels; namely [...] containing single phonemes
[...] the central phoneme of a window;
[...]
the first and fifth phonemes of the window [...]
level identifies a [...]
identification of a portion of the lowest [...]
stored window of phonemes;
and the matching comprises selecting an exact match
for the central phoneme of the input window from the first
level of the hierarchy, [...]
2 and 4 from the second level of the hierarchy corresponding
[...] selection in the second level of the hierarchy.

[...] digital waveform and [...] in phonemes, said
[...] representing said extended digital waveform [...] common
[...]
corresponding segment of the digital waveform.
[...] and said access portion [...]
level accessed by a string of [...] lower [...]

[...] a speech engine [...]
(11) for converting a text in [...] converting said
[...] in phonemes and a [...]
text in phonemes [...], characterised in
that the converter (12) includes a database (13) according to
either Claim 5 or Claim 6.

8. A telephone network which [...] a speech engine
according to Claim 7, said speech [...]
the network for the transmission [...]
engine to a remote location.

9. A method of converting [...]
signal, wherein said input signal [...]
phonemes and said [...]
convertible into an acoustic [...] corresponding to said
text, wherein said [...] database
having an access section based on strings of phonemes and an
output section [...] corresponding to
the linked access sections, [...]
matching a segment of said input [...] best
match of strings contained [...], said
match including an exact match [...] least one internal
phoneme and [...] phonemes
[...] a shorter string of phonemes [...]